Clustering with missing features: a penalized dissimilarity measure based approach
Similar resources
Clustering with Missing Features: A Penalized Dissimilarity Measure based approach
Many real-world clustering problems are plagued by incomplete data characterized by missing or absent features for some or all of the data instances. Traditional clustering methods cannot be directly applied to such data without preprocessing by imputation or marginalization techniques. In this article, we put forth the concept of Penalized Dissimilarity Measures which estimate the actual dista...
A New Dissimilarity Measure for Clustering Seismic Signals
Hypocenter and focal mechanism of an earthquake can be determined by the analysis of signals, named waveforms, related to the wave field produced and recorded by a seismic network. Assuming that waveform similarity implies the similarity of focal parameters, the analysis of those signals characterized by very similar shapes can be used to give important details about the physical phenomena whic...
Clustering and a Dissimilarity Measure for Methadone Dosage Time Series
In this work we analyze data for 314 participants of a methadone study over 180 days. Dosages in mg were converted for better interpretability to seven categories in which six categories have an ordinal scale for representing dosages and one category for missing dosages. We develop a dissimilarity measure and cluster the time series using “partitioning around medoids” (PAM). The dissimilarity m...
A dissimilarity measure for the k-Modes clustering algorithm
Clustering is one of the most important data mining techniques that partitions data according to some similarity criterion. The problems of clustering categorical data have attracted much attention from the data mining research community recently. As the extension of the k-Means algorithm, the k-Modes algorithm has been widely applied to categorical data clustering by replacing means with modes...
Extending k-Representative Clustering Algorithm with an Information Theoretic-based Dissimilarity Measure for Categorical Objects
This paper aims at introducing a new dissimilarity measure for categorical objects into an extension of k-representative algorithm for clustering categorical data. Basically, the proposed dissimilarity measure is based on an information theoretic definition of similarity introduced by Lin [15] that considers the amount of information of two values in the domain set. In order to demonstrate the ...
Journal
Journal title: Machine Learning
Year: 2018
ISSN: 0885-6125,1573-0565
DOI: 10.1007/s10994-018-5722-4